We define a new family of random spin models with one-dimensional structure, finite-range
multi-spin interactions, and bounded average degree (number of interactions in which each
spin participates). Unfrustrated ground states can be described as solutions of a sparse,
band diagonal linear system, thus allowing for efficient numerical analysis.

In the limit of infinite interaction range, we recover the so-called XORSAT (diluted p-spin)
model, which is known to undergo a random first order phase transition as the average
degree is increased. Here we investigate the most important consequences of a large but finite
interaction range: (i) Fluctuation-induced corrections to thermodynamic quantities; (ii) The
need for an inhomogeneous (position dependent) order parameter; (iii) The emergence of a
finite mosaic length scale. In particular, we study the correlation length divergence at the
(mean-field) glass transition.

PACS numbers: 64.70.Pf (Glass transitions), 75.10.Nr (Spin-glass and other random models), 
89.20.Ff (Computer science) 

I. INTRODUCTION 

A large class of disordered mean field spin models exhibit a behavior that is reminiscent of the
structural glass transition in fragile glasses. As temperature is lowered, they undergo a
'dynamical phase transition' characterized by a diverging relaxation time at a critical temperature
$T_d$. The reason for such a dynamical arrest can in turn be ascribed to ergodicity breaking: below
$T_d$ the Boltzmann measure decomposes into an exponential number of pure states. While equilibration
is fast within each state, it takes an exponentially large (in the system size) time to equilibrate
across states.

Below $T_d$, the system can be meaningfully characterized through its complexity $\Sigma$, which gives
the exponential growth rate of the number of pure states (i.e. the number of such states is about
$e^{N\Sigma}$, with $N$ being the size). The complexity decreases as temperature is further lowered, and vanishes
linearly at a second (static) transition temperature $T_s$. This corresponds to an actual thermodynamic
phase transition.

A strikingly similar scenario has been found to hold in a large array of random constraint
satisfaction problems^1 of interest in theoretical computer science [4, …]. The role of temperature
is played here by the number of constraints per variable $\gamma$, while the Boltzmann distribution is replaced
by the uniform measure over solutions of the problem. As the constraint density crosses a critical
value $\gamma_d$, the set of solutions splits into 'lumps' analogous to pure states. Above a second threshold
$\gamma_s$ the set of constraints becomes unsatisfiable with high probability.

In the last few years there has been a consistent effort in interpreting disordered mean field
models as a genuine mean field theory for the structural glass transition. This is highly non-trivial
since in any finite-dimensional model there cannot be coexistence of an exponentially large number
of pure states. Imagine trying to select one such state through appropriate boundary conditions
on a box of size $\ell$. This will imply an energetic bias towards the selected state, which is of order
$\sigma\ell^{\theta}$, where $0 < \theta \le d-1$ is a surface tension exponent. On the other hand, the entropic advantage
of the other states is of order $T\Sigma\,\ell^{d}$, because of their number. Therefore, for
$\ell \gtrsim \ell_s = (\sigma/T\Sigma)^{1/(d-\theta)}$ pure states are no longer stable.

According to the 'mosaic state' scenario, below $T_d$ a typical configuration of the system can be
described as a patchwork. Each patch corresponds to the configuration being close to a
particular pure state in a localized region whose length scale is $\ell_s$. Since $\Sigma \sim (T - T_s)$ at the static
transition, the mosaic length scale diverges as $\ell_s \sim (T - T_s)^{-\nu}$ with $\nu = 1/(d-\theta)$.

While the mosaic scenario is appealing, its consistency and implications, as well as its precise
meaning, are far from obvious. An important step forward was achieved in [13], where a concrete
"gedanken experiment" was introduced to define $\ell_s$. This length scale was interpreted in [11] as
a point-to-set correlation length, and its divergence was rigorously proved to be equivalent to a
divergent correlation time. In [13], $\ell_s$ was actually shown to diverge at $T_s$ in a class of disordered
Kac models with continuous scalar spins.

Unfortunately, the models considered in [13] can currently be treated only in the Kac limit, and
through somewhat formal techniques such as the replica method. As a consequence, many interesting
questions (such as the relevance of this limit for realistic interaction ranges, non-perturbative
fluctuation effects, a precise definition of states) cannot be addressed in this context. The present
paper aims at introducing a new class of models that share some features with the ones treated in
[13], while being tractable within alternative approaches (e.g. numerically).

^1 In a typical constraint satisfaction problem, one seeks an assignment of values to L discrete variables in such a
way as to satisfy M constraints.



We follow the route of generalizing one of the ensembles of random constraint satisfaction
problems mentioned above, referred to as $k$-XORSAT. We shall require constraints
to have finite range with respect to an underlying one-dimensional geometry. Our motivation is
twofold: (i) Because of its underlying linear structure, $k$-XORSAT is very well understood. In
particular, a wealth of information regarding pure states and their geometry is accessible through
rigorous techniques [18]; (ii) The ensembles of random constraint satisfaction problems
studied within the computer science and statistical mechanics communities have so far lacked any
geometrical structure (in physics terms, they are mean field models). This is of course a poor
cartoon of real world instances, and it is surely instructive to explore alternative, structured
models.



Constraint satisfaction problems with finite interaction range were already considered in [19],
without however treating the interaction range as a parameter. Further, the most important
questions that we shall consider in this paper were not studied there. Several papers [20, 21, 22]
investigated the behavior of thermodynamic quantities and local order in Kac spin glasses. Finally, a
one-dimensional Kac spin glass with a different (continuous) phase transition was recently studied
numerically in [23].



The paper is organized as follows. In Section II we introduce our Kac-XORSAT model and
its variants, and discuss some of their most basic properties in Section III. We investigate
thermodynamic quantities (in particular the ground state entropy) in Section IV and the correlation
length divergence in Section V. Finally, a discussion of our results is presented in Section VI, and several
technical details are contained in the Appendices.



II. DEFINITION OF THE MODEL 



Let us recall that an instance of the $k$-XORSAT problem is defined by an $M \times L$ binary matrix
$\mathbb{H}$, with row weight^2 $k$, and a binary vector $b$ of length $M$. Solving such an instance requires
determining whether the linear system

$$\mathbb{H}\,x = b \mod 2 \tag{1}$$

admits a solution $x \in \{0,1\}^L$. This question is equivalent to asking whether the ground state
energy of a certain Ising spin model is zero or not. More precisely, let $\{i_1(a), \dots, i_k(a)\} \subseteq [L]$
denote the indices of the non-zero entries in the $a$-th row of $\mathbb{H}$, and $J_a = (-1)^{b_a}$ (here $a \in [M]$,
and $[n] = \{1, \dots, n\}$). The relevant spin model is defined by letting the energy of configuration
$\underline{\sigma} = (\sigma_1, \dots, \sigma_L) \in \{+1,-1\}^L$ be

$$E(\underline{\sigma}) = \sum_{a=1}^{M} \bigl(1 - J_a\, \sigma_{i_1(a)} \cdots \sigma_{i_k(a)}\bigr). \tag{2}$$

^2 The row weight is the number of non-vanishing entries in each row of the matrix.

In the following we shall refer to a particular XORSAT instance as a 'formula' or a 'sample'.

The random $k$-XORSAT (rXOR) ensemble is defined by letting $\mathbb{H}$ be a uniformly random binary
matrix (with dimensions $M \times L$ and row weight $k$) and $b$ a uniformly random vector in $\{0,1\}^M$. It
is also useful to consider the unfrustrated random $k$-XORSAT ensemble, defined by setting $b = 0$
(the all 0's vector) deterministically. Such an ensemble exhibits a particularly rich behavior in the
'thermodynamic' limit $L \to \infty$, $M \to \infty$ with $\gamma = M/L$ kept fixed.

The Kac $k$-XORSAT (KacXOR) ensembles add to the above features a one-dimensional (or,
in linear algebra terms, a band diagonal) structure. One such ensemble is characterized by the
parameters introduced so far, namely $k$, $L$, and $\gamma$, plus an 'interaction range' $R$. Unlike in the
rXOR ensemble, $\gamma$ is required to be in $[0,1]$, although generalizations are not difficult. Further,
the interaction range is an integer such that $2R + 1 > k$. Given such parameters, the matrix $\mathbb{H}$
is sampled as follows. Rows of $\mathbb{H}$ are indexed by a subset $F$ of $[L]$: for each $a = 1, \dots, L$, $a \in F$
independently from the others with probability $\gamma$. In particular, the number of rows of $\mathbb{H}$, $M$, is a
binomial random variable

$$\mathbb{P}\{M\} = \binom{L}{M}\,\gamma^{M}(1-\gamma)^{L-M}. \tag{3}$$



As $L \to \infty$, the number of rows is, with high probability^3, close to $L\gamma$. For each $a \in F$, the
corresponding row in $\mathbb{H}$ is sampled independently from the others by letting the indices of non-zero
entries $(i_1(a), \dots, i_k(a))$ be a uniformly random subset of $\{a-R, \dots, a+R\}$ (i.e. each of the $\binom{2R+1}{k}$
subsets has the same probability). We shall refer to $\{a-R, \dots, a+R\}$ as the range of equation
$a$.

Finally, we let the entries of $b$ be indexed by $F$ as well, and iid random in $\{0,1\}$. As in the case
of random XORSAT, some simplification is achieved by considering an unfrustrated ensemble in
which $b = 0$.
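
The sampling procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the results reported here: the function name `sample_kacxor` and its arguments are hypothetical, indices are 0-based, and periodic boundary conditions (variable indices taken modulo L) are assumed.

```python
import random

def sample_kacxor(L, R, k, gamma, rng=None, frustrated=False):
    """Sample a standard KacXOR formula with periodic boundary conditions.

    Returns (F, rows, b): F lists the row indices a that were selected,
    rows[j] is the sorted tuple of the k variable indices of row F[j],
    and b[j] is the corresponding right-hand side bit.
    """
    if not 2 * R + 1 > k:
        raise ValueError("the interaction range must satisfy 2R + 1 > k")
    if L < 2 * R + 1:
        raise ValueError("need L >= 2R + 1 so that wrapped indices stay distinct")
    rng = rng or random.Random()
    F, rows, b = [], [], []
    for a in range(L):
        # a belongs to F independently with probability gamma
        if rng.random() >= gamma:
            continue
        # uniformly random k-subset of the range {a-R, ..., a+R}
        support = rng.sample(range(a - R, a + R + 1), k)
        F.append(a)
        rows.append(tuple(sorted(i % L for i in support)))
        # iid uniform bits in the frustrated ensemble, b = 0 otherwise
        b.append(rng.randrange(2) if frustrated else 0)
    return F, rows, b
```

For instance, `sample_kacxor(200, 5, 3, 0.5)` returns roughly 100 rows, each supported on three variables within distance 5 (modulo 200) of its row index.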



^3 Following the use in probability theory, we say that something happens with high probability (w.h.p.) if its
probability approaches 1 in the thermodynamic limit.

FIG. 1: Factor graph representation of a portion of a KacXOR formula with $k = 3$ and $R = 3$. Empty
circles correspond to variables (columns of $\mathbb{H}$) and filled squares to equations (rows of $\mathbb{H}$).



A XORSAT formula admits a natural representation as a factor graph $G$. This is a bipartite
graph including one 'parity check node' for each row in $\mathbb{H}$ (i.e. for each equation in the linear
system), and one 'variable node' for each column (i.e. for each variable in the linear system).
A parity check and a variable node are connected by an edge if the corresponding entry of $\mathbb{H}$ is
non-vanishing. An example of such a representation is presented in Fig. 1.

There is still one point of the above definition to be clarified. When $a \le R$ or $a > L - R$, the
range of equation $a$ is not included in the set of variable indices, and it might be that $i_l(a) \le 0$ or
$i_l(a) > L$. We shall consider two types of boundary conditions. With periodic boundary conditions,
variable indices are interpreted modulo $L$. Therefore, if for some index we have $i_l(a) > L$, this is
identified with $i_l(a) - L$, while, if $i_l(a) \le 0$, this is identified with $i_l(a) + L$.

In the case of fixed boundary conditions we let the set of potential row indices of $\mathbb{H}$ be
$\{-R+1, \dots, L+R\}$. Namely, $F$ includes each $a$ in this set independently with probability $\gamma$. To
define a fixed boundary condition, we shall fix a doubly infinite reference configuration^4 $x^{(0)} =
\{x^{(0)}_i : i \in \mathbb{Z}\}$. If in building row $a$ we get an index $i_l(a) \notin [L]$, the corresponding non-zero entry is
not included in $\mathbb{H}$, but rather the value $x^{(0)}_{i_l(a)}$ is added to $b_a$. This corresponds to fixing $x = x^{(0)}$
'outside' $\{1, \dots, L\}$. Finally, we shall agree that, whenever considering the unfrustrated ensemble,
the reference configuration will be the all 0's sequence $x^{(0)} = 0$.

^4 Notice that the boundary condition depends on the reference configuration $x^{(0)}$ only through $x^{(0)}_{-2R+1}, \dots, x^{(0)}_{0}$ and
$x^{(0)}_{L+1}, \dots, x^{(0)}_{L+2R}$.
III. FRUSTRATED VERSUS UNFRUSTRATED ENSEMBLE

The most important feature of the rXOR ensemble in the large size limit is that it undergoes a
SAT-UNSAT phase transition at a well defined constraint density $\gamma_s(k)$. More precisely, a random
XORSAT formula of the type (1) is solvable (SAT) with high probability if $\gamma < \gamma_s(k)$, while it is
not solvable (UNSAT) for $\gamma > \gamma_s(k)$ [3, …].

It is a convenient feature of XORSAT that this phase transition can be studied by considering
only the unfrustrated ensemble. This simplification can be explained through the well-known
identity

$$\mathbb{P}\{\mathbb{H}x = b \text{ is SAT}\} = \mathbb{E}\bigl\{2^{L-M}/Z(\mathbb{H})\bigr\}, \tag{4}$$

where $Z(\mathbb{H})$ denotes the number of solutions of the homogeneous linear system $\mathbb{H}x = 0 \mod 2$.
The identity holds irrespective of the distribution of $\mathbb{H}$, provided the right hand side of Eq. (1), i.e. the
vector $b$, is uniformly random. In order to prove it, it is sufficient to notice that $\mathbb{H}x = b$ is SAT if and
only if $b$ is in the image of $\mathbb{H}$. Since the dimension of the image of $\mathbb{H}$ is $\mathrm{rank}(\mathbb{H}) = L - \dim\ker(\mathbb{H})$,
this happens with probability $2^{L-\dim\ker(\mathbb{H})}/2^{M}$. On the other hand, $Z(\mathbb{H}) = 2^{\dim\ker(\mathbb{H})}$. Equation
(4) follows by taking expectation with respect to $\mathbb{H}$.
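
As a sanity check of this argument, the identity can be verified by exhaustive enumeration on a small fixed matrix. The sketch below is illustrative only: the helper names `gf2_rank`, `count_solutions_bruteforce` and `sat_probability_bruteforce` are hypothetical, rows are represented by the tuple of their non-zero column indices (0-based), and the enumeration is exponential in both L and M, so it is meant for toy sizes.

```python
import itertools

def gf2_rank(rows):
    """Rank over GF(2); each row is an iterable of non-zero column indices."""
    pivots = {}                      # leading bit -> reduced row (bitmask)
    for r in rows:
        v = 0
        for i in r:
            v ^= 1 << i
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v      # new independent row
                break
            v ^= pivots[top]         # eliminate the leading bit
    return len(pivots)

def count_solutions_bruteforce(rows, b, L):
    """Number of x in {0,1}^L satisfying every parity equation."""
    return sum(
        all(sum(x[i] for i in r) % 2 == bit for r, bit in zip(rows, b))
        for x in itertools.product((0, 1), repeat=L)
    )

def sat_probability_bruteforce(rows, L):
    """Fraction of right-hand sides b for which the system is solvable."""
    M = len(rows)
    sat = sum(
        count_solutions_bruteforce(rows, b, L) > 0
        for b in itertools.product((0, 1), repeat=M)
    )
    return sat / 2 ** M

# Toy example with L = 4, M = 3: the third row is the sum of the first two.
rows, L = [(0, 1, 2), (1, 2, 3), (0, 3)], 4
Z = count_solutions_bruteforce(rows, (0, 0, 0), L)
print(gf2_rank(rows), Z, sat_probability_bruteforce(rows, L), 2 ** (L - len(rows)) / Z)
```

In this example the rank is 2, so $Z = 2^{4-2} = 4$ and both sides of Eq. (4), for this fixed matrix, equal 1/2.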

Within the rXOR ensemble, for $\gamma < \gamma_s(k)$, $Z(\mathbb{H})$ is tightly concentrated around $2^{L-M}$, implying
$\mathbb{P}\{\mathbb{H}x = b \text{ is SAT}\} \approx 1$. Vice versa, for $\gamma > \gamma_s(k)$, typically $Z(\mathbb{H}) \doteq 2^{L\phi(\gamma)}$ (here $\doteq$ denotes
equality to leading exponential order), with $\phi(\gamma) > 1 - \gamma$, and therefore the formula is SAT with
exponentially small probability.

Furthermore, as long as the non-homogeneous system has at least one solution, its number of
solutions is independent of $b$, and is given by $Z(\mathbb{H})$. Moreover, the set of solutions is an affine space
obtained by translating the linear space of solutions of the homogeneous system. In other words,
conditional on the problem being solvable (which happens with high probability for $\gamma < \gamma_s(k)$) the
frustrated and unfrustrated ensembles are essentially equivalent.

An important novelty within the KacXOR ensemble is that the linear system (1) is always
UNSAT with high probability if we let $L \to \infty$ with $\gamma$, $R$ fixed. More precisely, we expect that

$$\mathbb{P}\{\mathbb{H}x = b \text{ is SAT}\} \doteq e^{-L\Lambda(\gamma,R)}, \tag{5}$$

for some $\Lambda(\gamma, R)$ strictly positive and non-decreasing in $\gamma$. This phenomenon was already observed
in [3] for a related model. The basic reason for this behavior is that small subsets of nearby rows
of $\mathbb{H}$ have a fair chance of being linearly dependent. If this is the case, the corresponding linear
subsystem is unsolvable with finite probability. In the large $L$ limit, the expected number of such
substructures is of order $L$, and the probability that none is present is exponentially small, thus
leading to the above behavior.

It is not difficult to prove the above statement, and indeed to prove lower bounds on the rate
$\Lambda(\gamma, R)$ by combinatorial techniques. The basic idea is to select a particular type of substructure
that leads to unsatisfiability and estimate the probability that no such substructure is present. The
simplest such substructure is obtained when two lines of $\mathbb{H}$ coincide but the corresponding entries of
$b$ do not. Using the Janson inequality this yields

A(7, R) > Ki^^ - K27'(l - i^07')~' , (6) 

where 




Such a lower bound is easily seen to be strictly positive for $\gamma$ small enough. We refer to Appendix
A for a derivation of this formula.

Notice that the lower bound in Eq. (6) vanishes when $R \to \infty$. We expect the same
behavior to hold for the actual exponent as long as $\gamma$ is below the (mean-field) satisfiability threshold
$\gamma_s(k)$. Explicitly

K{^,R)=M^)/R^ + 0{l/R^+^), (9) 

where $\Lambda_1(\gamma) \uparrow +\infty$ as $\gamma \uparrow \gamma_s(k)$.

In the following we shall avoid dealing with the above phenomenon by focusing directly on the 
unfrustrated ensemble: this will enable us to use efficient linear algebra techniques for numerical 
computations. There are several justifications for doing this: 

1. The two ensembles become equivalent in the Kac limit, which is our main concern here. 

2. We are interested in the long distance properties of the model, rather than in the effect of 
small substructures. We think that the two decouple for large R. 

3. Even if the frustrated ensemble is with high probability unsatisfiable, one can always consider
'almost satisfying' configurations. Equivalently, one can study the Boltzmann measure for
the energy (2) at a small non-vanishing temperature $T$. We expect the effect of small
frustrated substructures on the thermodynamics to be small, and indeed vanishing
for large $R$.

FIG. 2: Ground state entropy density in the thermodynamic limit ($\phi_R(\gamma) = \lim_{L\to\infty} \phi_{L,R}(\gamma)$, cf. Eq. (10))
for the standard and improved ensembles. Here $k = 3$ and $\gamma = 0.4$. The horizontal line marks the $R \to \infty$
limit $\phi(\gamma) = 0.6$.

In this perspective, we shall introduce an improved ensemble which reduces the effect of small
substructures, while keeping the large $R$ behavior unchanged. This is particularly convenient in
numerical simulations.

Ideally, one would like to consider a uniform ensemble conditioned on some class of substructures
being absent. In practice it can be exceedingly difficult to sample matrices $\mathbb{H}$ from such a conditional
ensemble. We shall define the improved ensemble by the following sequential procedure. First
generate the random set $F \subseteq [L]$ by letting $i \in F$ independently for each $i = 1, \dots, L$ with
probability $\gamma$. The set $F$ will index lines of $\mathbb{H}$ as above. Then we choose a uniformly random
ordering $(i(1), \dots, i(M))$ of the elements of $F$, and generate the corresponding lines of $\mathbb{H}$ following
such an order. For each $t = 1, \dots, M$ we try to generate the line of $\mathbb{H}$ indexed by $i(t)$ by drawing
its $k$ non-zero elements uniformly at random in $\{i(t)-R, \dots, i(t)+R\}$. If the newly generated
line has $k-1$ or $k$ non-vanishing entries in common with one of the previously generated lines
$\{i(1), \dots, i(t-1)\}$, we reject it and re-sample it. We repeat the trial-rejection step at most 100
times. If no valid line is generated within this round, the whole system generated so far is rejected
and the procedure is re-initiated from scratch.
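
The sequential procedure can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code behind the figures: the name `sample_improved_kacxor` is hypothetical, indices are 0-based, periodic boundary conditions are assumed, and each candidate line is compared against all previously accepted lines (a faster version would only look at lines with nearby indices).

```python
import random

def sample_improved_kacxor(L, R, k, gamma, rng=None, max_tries=100):
    """Sample the improved ensemble; returns {row index a: sorted support}."""
    rng = rng or random.Random()
    while True:                                   # restart from scratch on failure
        F = [a for a in range(L) if rng.random() < gamma]
        order = list(F)
        rng.shuffle(order)                        # uniformly random ordering of F
        rows = {}
        for a in order:
            for _ in range(max_tries):
                cand = frozenset(
                    i % L for i in rng.sample(range(a - R, a + R + 1), k)
                )
                # reject if k-1 or k entries are shared with an earlier line
                if all(len(cand & prev) < k - 1 for prev in rows.values()):
                    rows[a] = cand
                    break
            else:
                break      # no valid line within max_tries: abandon this attempt
        else:
            # every line was placed successfully
            return {a: tuple(sorted(rows[a])) for a in sorted(rows)}
```

Note that the loop may run for a very long time if the parameters leave little room for valid lines (small R together with large $\gamma$); the sketch makes no attempt to detect this.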

We shall refer to the first ensemble introduced above as the standard one, whenever it is
necessary to distinguish it from the improved ensemble. In Fig. 2 we compare the $R \to \infty$ behavior
of the ground state entropy for the two ensembles. Although the limits clearly coincide, the
improved ensemble is close to its limit even for $R = 5$.



IV. GROUND STATE ENTROPY 



The simplest thermodynamic quantity that is relevant for the study of the unfrustrated KacXOR
problem is the ground state entropy, i.e. the logarithm of the number of solutions of the linear
system. Let us denote by $Z(\mathbb{H})$ the number of solutions of the linear system (1) for a random
binary matrix $\mathbb{H}$. Then the average entropy density is defined as

$$\phi_{L,R}(\gamma) = \frac{1}{L}\,\mathbb{E}\log_2 Z(\mathbb{H}). \tag{10}$$

In order to compare analytical predictions and numerical data it will be convenient to define the
'subtracted' entropy density $\tilde{\phi}_{L,R}(\gamma) = \phi_{L,R}(\gamma) - \phi_{\rm naive}(\gamma)$, where $\phi_{\rm naive}(\gamma) = 1 - \gamma$. Notice that
$\phi_{\rm naive}(\gamma)$ is the naive prediction that would be obtained by assuming the lines of $\mathbb{H}$ to be linearly
independent.

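Given a matrix $\mathbb{H}$, the corresponding number of solutions takes the form of a partition function

$$Z(\mathbb{H}) = \sum_{x} \prod_{a=1}^{L} \psi_a(x_{a-R}, \dots, x_{a+R}), \tag{11}$$

where (denoting by $\oplus$ the sum modulo 2)

$$\psi_a(x_{a-R}, \dots, x_{a+R}) = \begin{cases} \mathbb{I}\bigl(x_{i_1(a)} \oplus \cdots \oplus x_{i_k(a)} = 0\bigr) & \text{if } a \in F, \\ 1 & \text{if } a \notin F. \end{cases} \tag{12}$$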



Due to the finite interaction range $R$, $Z = Z(\mathbb{H})$ can be computed through a transfer matrix
algorithm which recursively computes left and right partition functions, respectively $Z_{\to i}$ and $Z_{i\leftarrow}$.
These are indexed by $z = (z_1, \dots, z_{2R}) \in \{0,1\}^{2R}$, and defined as

$$Z_{\to i}(z) = \sum_{x_1 \dots x_i} \mathbb{I}\bigl(x^{(2R)}_{i-2R} = z\bigr) \prod_{a=1}^{i-R} \psi_a(x_{a-R}, \dots, x_{a+R}), \tag{13}$$

$$Z_{i\leftarrow}(z) = \sum_{x_i \dots x_L} \mathbb{I}\bigl(x^{(2R)}_{i-1} = z\bigr) \prod_{a=i+R}^{L} \psi_a(x_{a-R}, \dots, x_{a+R}), \tag{14}$$

where we used the notation $x^{(2R)}_j = (x_{j+1}, \dots, x_{j+2R})$. A recursion naturally follows

$$Z_{\to(i+1)}(z_1, \dots, z_{2R}) = \sum_{z_0 \in \{0,1\}} \psi_{i-R+1}(z_0, z_1, \dots, z_{2R})\, Z_{\to i}(z_0, \dots, z_{2R-1}), \tag{15}$$

together with the analogous recursion for $Z_{i\leftarrow}$. It is clear that the total number of solutions can be
computed from the constrained partition functions.

The naive transfer matrix algorithm defined by the recursion (15) has complexity of order
$O(L\,2^{2R})$. This severely limits the interaction ranges that can be treated with this method: in
practice we could deal at most with $R = 10$ to $11$, which is far too small to address issues concerning
the $R \to \infty$ limit. In order to overcome this problem, we developed a transfer matrix algorithm
that, while computing exactly the constrained partition functions, exploits the linear structure of
the problem in such a way as to reduce the complexity to $O(LR^{2})$. Thanks to this approach, we were
able to treat systems with $R = 100$ or larger. For details on the algorithm we refer to Appendix B.
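
To make the naive recursion concrete, here is a minimal Python sketch of a left-to-right sweep that stores one weight per configuration of the last $2R$ variables, so its cost grows like $L\,2^{2R}$. It is not the improved algorithm of Appendix B. The names `count_solutions_transfer`, `gf2_rank` and `sample_bulk_rows` are hypothetical, indices are 0-based, and for simplicity the toy sampler only generates rows whose range lies entirely inside the system. The result is cross-checked against the elementary fact that a homogeneous system over GF(2) has $2^{L-\mathrm{rank}}$ solutions.

```python
import random

def gf2_rank(rows):
    """Rank over GF(2); each row is a tuple of non-zero column indices."""
    pivots = {}
    for r in rows:
        v = 0
        for i in r:
            v ^= 1 << i
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)

def count_solutions_transfer(rows, L, R):
    """Count solutions of the homogeneous system by sweeping over variables.

    Requires R >= 1 and every row to fit in a window of 2R + 1 consecutive
    variables (max index - min index <= 2R).
    """
    W = 2 * R
    closing = [[] for _ in range(L)]     # rows whose largest index is i
    for r in rows:
        if max(r) - min(r) > W:
            raise ValueError("row does not fit in a window of width 2R + 1")
        closing[max(r)].append(r)
    Z = {(): 1}                          # window of the last <= 2R variables -> weight
    for i in range(L):
        lo = max(0, i - W)               # index of the first variable in the window
        new_Z = {}
        for window, weight in Z.items():
            for xi in (0, 1):
                ext = window + (xi,)     # values of x_lo, ..., x_i
                if all(sum(ext[j - lo] for j in r) % 2 == 0 for r in closing[i]):
                    key = ext[-W:]       # forget variables that left the window
                    new_Z[key] = new_Z.get(key, 0) + weight
        Z = new_Z
    return sum(Z.values())

def sample_bulk_rows(L, R, k, gamma, rng):
    """Toy sampler: rows a = R, ..., L-R-1 only, so no boundary handling."""
    return [tuple(sorted(rng.sample(range(a - R, a + R + 1), k)))
            for a in range(R, L - R) if rng.random() < gamma]

rows = sample_bulk_rows(14, 2, 3, 0.7, random.Random(3))
print(count_solutions_transfer(rows, 14, 2), 2 ** (14 - gf2_rank(rows)))
```

The two printed numbers coincide. The rank formula itself is the simplest way of exploiting linearity to count solutions, and Gaussian elimination on a band matrix is cheap; whether this matches the algorithm of Appendix B in detail is not claimed here.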

We are interested in the double limit $R, L \to \infty$. We shall consider two procedures to define the
limit. The first one corresponds to the classical Kac limit, and consists in taking the thermodynamic
limit upfront to define

$$\phi_R(\gamma) = \lim_{L\to\infty} \phi_{L,R}(\gamma). \tag{16}$$

Next, we let $R \to \infty$. In Appendix C we will show that $\phi_R(\gamma)$ can be expanded for large $R$ as
follows

$$\phi_R(\gamma) = \phi^{(0)}(\gamma) + \frac{1}{R}\,\phi^{(1)}(\gamma) + o\Bigl(\frac{1}{R}\Bigr). \tag{17}$$

The leading term gives the mean-field limit and coincides with the ground state entropy density
within the rXOR ensemble [3]. It can be expressed in the form $\phi^{(0)}(\gamma) = \max_{\varphi} \phi^{(0)}(\gamma; \varphi)$,
where 

It is easy to show that the max is achieved for a value of the order parameter $\varphi$ that satisfies the
equation $\varphi = 1 - \exp\{-k\gamma\varphi^{k-1}\}$. For $\gamma < \gamma_s(k)$, the maximum is at $\varphi = 0$, yielding $\phi^{(0)}(\gamma) = 1 - \gamma$.
In other words, the rank of $\mathbb{H}$ is smaller than the maximum possible value by a fraction of
order $1/R$. For $\gamma > \gamma_s(k)$, the maximum is at $\varphi = \varphi_* > 0$ strictly, and $\phi^{(0)}(\gamma) > 1 - \gamma$: the rank
of $\mathbb{H}$ remains strictly smaller than its maximum possible value even as $R \to \infty$. For instance we
have $\gamma_s(k) \approx 0.917935$ for $k = 3$.
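
The quoted threshold can be reproduced numerically from the fixed-point equation above. The sketch below relies on an assumption that is not spelled out in this section: it uses the standard characterization of the XORSAT threshold as the density at which the 2-core of the factor graph has as many equations, $\gamma\varphi^{k}$ per variable, as variables, $1 - e^{-\lambda}(1+\lambda)$ per variable with $\lambda = k\gamma\varphi^{k-1}$, where $\varphi$ is the largest solution of the fixed-point equation. Function names and the default bracketing interval (chosen for $k = 3$) are hypothetical.

```python
import math

def largest_fixed_point(gamma, k, iters=5000):
    """Iterate phi -> 1 - exp(-k*gamma*phi^(k-1)) starting from phi = 1."""
    phi = 1.0
    for _ in range(iters):
        phi = 1.0 - math.exp(-k * gamma * phi ** (k - 1))
    return phi

def core_balance(gamma, k):
    """(equations - variables) in the 2-core, per variable (assumed criterion)."""
    phi = largest_fixed_point(gamma, k)
    lam = k * gamma * phi ** (k - 1)
    core_variables = 1.0 - math.exp(-lam) * (1.0 + lam)
    core_equations = gamma * phi ** k
    return core_equations - core_variables

def gamma_s(k, lo=0.85, hi=1.0, tol=1e-10):
    """Bisection for the density at which core_balance changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if core_balance(mid, k) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

print(round(gamma_s(3), 6))
```

For $k = 3$ the bisection lands on the value quoted in the text to the displayed precision. Below the dynamical threshold the iteration collapses to $\varphi = 0$, for example `largest_fixed_point(0.5, 3)` returns 0.0.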

The first-order contribution $\phi^{(1)}(\gamma)$ is related to fluctuations around the saddle point in an
appropriate path integral representation of Eq. (11). Its expression is given in Appendix C.

In Fig. 3 we plot the numerical estimates for the subtracted entropy density $\tilde{\phi}_{L,R}(\gamma)$, as obtained
with our transfer matrix algorithm for $R = 25$ and several system sizes. Data points are the result
of averaging over 1000 realizations of $\mathbb{H}$ with $k = 3$. The same statistics and value of $k$ will
be used in the other numerical experiments below, and we shall not mention them again. Further,
unless otherwise stated, we will keep using the improved ensemble. We also show the result of a
$1/L$ extrapolation to $L = \infty$. The control of the thermodynamic limit is quite good (although
corrections at moderate values of $L$ are large). It is clear that the $L = \infty$ extrapolation is not
compatible with the mean field prediction, and that $1/R$ corrections must be taken into account.

FIG. 3: Subtracted entropy density $\tilde{\phi}_{L,R}(\gamma) = \phi_{L,R}(\gamma) - 1 + \gamma$ for various values of $L$ and $R = 25$ constant.
We also plot the result of an $L \to \infty$ extrapolation, and the analytical mean-field prediction $\phi^{(0)}(\gamma) - 1 + \gamma$
(continuous line).

Figure 4 shows the result of such an $L \to \infty$ extrapolation for several values of $R$. The data
seem to approach the mean field prediction $\phi^{(0)}(\gamma) - 1 + \gamma$ as $R \to \infty$, although the approach is
rather slow.

In order to better study the large-$R$ limit, for 4 values of $\gamma$ we computed the ground state
entropy for a wide range of $R$. The result is compared in Figure 5 with the asymptotic expression
(17). In this case we used the standard ensemble, which presents larger $1/R$ corrections (computing
the first order correction $\phi^{(1)}$ within the improved ensemble is technically much more difficult). It
turns out from the analysis in Appendix C that $\phi^{(1)}(\gamma) = 0$ for $\gamma < \gamma_s(k)$ while $\phi^{(1)}(\gamma) \neq 0$ for
$\gamma > \gamma_s(k)$. Our data confirm this behavior. Further, although $O(R^{-2})$ contributions are rather
large, the leading $1/R$ correction to mean field does indeed match the analytical prediction.
FIG. 4: Subtracted entropy density in the thermodynamic limit: $\tilde{\phi}_R(\gamma) = \phi_R(\gamma) - 1 + \gamma$ for various values
of $R$, together with the mean field prediction $\phi^{(0)}(\gamma) - 1 + \gamma$ (continuous line).

FIG. 5: Ground state entropy density $\phi_R(\gamma)$, extrapolated to the thermodynamic limit, versus the inverse
interaction range $1/(2R+1)$ for various values of $\gamma$. Straight lines correspond to the analytic prediction, cf.
Eq. (17).



The second limit we shall consider is $L, R \to \infty$ with $\ell = L/R$ kept fixed. We thus define

$$\phi_{\ell}(\gamma) = \lim_{R\to\infty} \phi_{L=\ell R,\,R}(\gamma). \tag{19}$$

This is the mean-field limit for a system of finite size. The limit can be computed exactly by
maximizing an appropriate action functional over a position-dependent order parameter. More
precisely we have $\phi_{\ell}(\gamma) = \max_{\varphi} \phi_{\ell}[\varphi]$, where $\varphi : [0,\ell] \to \mathbb{R}$ is the order parameter, and



By differentiating with respect to $\varphi$, we obtain the mean-field equation



dz . (20) 



99(z) = l-^ p expj-^ p i^{z + u + vf~^dv^du. (21) 

We refer to Appendix C for a derivation of these formulae and limit ourselves to discussing their
consequences here.

In the case of a homogeneous order parameter $\varphi(z) = \varphi$ independent of $z$, Eq. (21) is satisfied
if $\varphi$ is a solution of the standard mean field equation, $\varphi = 1 - \exp\{-k\gamma\varphi^{k-1}\}$. The action (20)
then reduces to the mean field free energy, Eq. (18).

In the general case the order parameter $\varphi(z)$ has a simple interpretation. Consider the linear
system $\mathbb{H}x = 0 \mod 2$, and let $i \in \{1, \dots, L\}$. Then one of the following must happen: either
$x_i = 0$ in all of the solutions; or $x_i = 0$ in half of the solutions and $x_i = 1$ in the other half. We
shall call $x_i$ (or, sometimes, $i$) a frozen variable in the first case, and a free variable in the second
one. Given $z \in [0,\ell]$, the number of frozen variables with $i \in [Rz, R(z + dz)]$ in a typical random
linear system from our ensemble is about $R\,\varphi(z)\,dz$. Equivalently, the probability for $x_i$, $i = \lfloor Rz \rfloor$,
to be frozen converges to $\varphi(z)$.
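
The frozen/free dichotomy is easy to test on a given matrix: $x_i = 0$ in every solution of the homogeneous system exactly when the unit vector $e_i$ lies in the row space of the matrix, because the solution space is the orthogonal complement of the row space over GF(2). The sketch below uses this fact; the function name `frozen_variables` is hypothetical, rows are given as tuples of non-zero column indices (0-based), and nothing here is taken from the numerical code used for the figures.

```python
def frozen_variables(rows, L):
    """Return a list of booleans: True where x_i = 0 in all solutions of H x = 0."""
    pivots = {}                          # leading bit -> basis row (bitmask)

    def reduce(v):
        # Reduce v against the current basis; the result is 0 iff v is in its span.
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                return v
            v ^= pivots[top]
        return 0

    for r in rows:
        v = 0
        for i in r:
            v ^= 1 << i
        v = reduce(v)
        if v:                            # independent of the rows seen so far
            pivots[v.bit_length() - 1] = v
    # x_i is frozen iff the unit vector e_i belongs to the row space
    return [reduce(1 << i) == 0 for i in range(L)]

# Toy example: x_0 + x_1 = 0 and x_1 = 0 freeze x_0 and x_1, while x_2 is free.
print(frozen_variables([(0, 1), (1,)], 3))   # [True, True, False]
```

Averaging the entries of this list over windows of $R\,dz$ consecutive sites, and over samples, gives a direct numerical estimate of $\varphi(z)$.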

We shall come back to this interpretation in the next Section, while using it here to derive
the appropriate boundary conditions for Eq. (21). If the linear system is defined with periodic
boundary conditions, then we have to use periodic boundary conditions in Eq. (21) as well, namely
$\varphi(z + \ell) = \varphi(z)$. If on the other hand we adopt fixed boundary conditions with respect to the
reference solution $x^{(0)} = 0$, then we have to impose $\varphi(z) = 1$ for $z < 0$ and $z > \ell$ in Eq. (21). As a
consequence, the homogeneous solution is no longer relevant in this case.

Once the boundary conditions have been established, Eq. (21) can be solved numerically by
iteration (after discretizing $z$ on a sufficiently fine mesh). In the regime in which multiple fixed
points exist, the relevant one is obtained by maximizing the action.

The result of such a computation is compared in Fig. 6 with the outcome of numerical simulations.
The agreement is good already at moderate interaction ranges. The main effect of a finite
size is a decrease in the number of solutions due to the fact that variables close to the boundary are
more highly constrained (and thus more likely to be frozen). This effect is accurately reproduced
by the analytical calculation.



FIG. 6: Subtracted entropy density $\tilde{\phi}_{L,R}(\gamma) = \phi_{L,R}(\gamma) - 1 + \gamma$ as a function of $\gamma$ for several values of $R$, at $\ell = L/R = 50$
fixed. The continuous line corresponds to the analytical prediction $\phi_{\ell}(\gamma) - 1 + \gamma$ in the $R \to \infty$ limit.

V. POINT-TO-SET CORRELATION FUNCTION 



As we have seen in the previous Section, the thermodynamic behavior of the KacXOR ensemble
at finite $R$ carries several traces of the mean field limit. Here we want to investigate some structural
features of the uniform measure over solutions of the linear system:

$$\mu(x) = \frac{1}{Z}\,\mathbb{I}(\mathbb{H}x = 0) = \frac{1}{Z} \prod_{a} \psi_a(x_{a-R}, \dots, x_{a+R}). \tag{22}$$

In particular, we want to understand whether the mean field ergodicity breaking transition shows
up in the long range correlations of this measure, as predicted within the mosaic state scenario.
It is expected that the long range order emerging at a glass transition cannot be probed through
ordinary point-to-point correlation functions, and that point-to-set correlation functions have to
be used instead [11]. These can be defined through the following "experiment" [13] (we refer here
to the one-dimensional case we are studying). Consider a large sample, $L \gg R$, let $i$ be a node in
its bulk, $i \gg R$, $L - i \gg R$, and $x^*$ a 'reference' configuration sampled from the measure $\mu(\,\cdot\,)$.
Then fix some $1 \ll \mathcal{L} \ll L$, and consider a second configuration $x$ that is forced to coincide with $x^*$
on sites $j$ with $|j - i| > \mathcal{L}$, and free otherwise, and compute the probability that $x_i \neq x^*_i$. The
expectation of this probability with respect to $x^*$ and the sample realization yields the desired
point-to-set correlation. In formulae, if we let $B(i,\mathcal{L}) = \{j : |i-j| \le \mathcal{L}\}$ be the box of size $2\mathcal{L}+1$
around $i$, and $\overline{B}(i,\mathcal{L})$ its complement, we define

$$G(\mathcal{L}, R, \gamma) \equiv \mathbb{E}\Bigl\{1 - 2\,\mu_{i|\overline{B}(i,\mathcal{L})}\bigl(x_i \neq x^*_i \,\big|\, x_{\overline{B}(i,\mathcal{L})} = x^*_{\overline{B}(i,\mathcal{L})}\bigr)\Bigr\}. \tag{23}$$

Here the thermodynamic limit $L \to \infty$ is assumed to be taken at the outset, $\mathbb{E}$ denotes expectation
both with respect to the matrix $\mathbb{H}$ and the reference configuration $x^*$, and the redefinition $1 - 2(\cdots)$
is for future convenience.

FIG. 7: Correlation $G(n; \mathcal{L}, R, \gamma)$ between the boundary of a box of size $2\mathcal{L}+1 = 2\ell R + 1$ and a point in its
interior (at distance $n = zR$ from the center). In the right frames: blow-up of the region near the boundary.
The continuous line (partially hidden by data points) corresponds to the analytic prediction obtained by
solving Eq. (21).

The linear structure of our problem implies two simplifications. First, the conditional probability
appearing in Eq. (23) is indeed independent of $x^*$ (which can be 'gauged away'). Therefore we can fix
$x^* = 0$ and eliminate the expectation over the reference configuration $x^*$. The resulting conditional
measure is just the distribution of a system with fixed boundary conditions as discussed in the
previous Section. This implies a second simplification (already noticed above). The conditional
probability $\mu_{i|\overline{B}(i,\mathcal{L})}(x_i \neq x^*_i \,|\, x_{\overline{B}(i,\mathcal{L})} = x^*_{\overline{B}(i,\mathcal{L})})$ can take value 1/2 (if $x_i$ is 'free') or 0 (if it is 'frozen'), so that
the quantity $1 - 2(\cdots)$ in Eq. (23) equals 0 for a free variable and 1 for a frozen one. We
thus get

$$G(\mathcal{L}, R, \gamma) = \mathbb{P}_{\mathcal{L}}\{\, x_{\mathcal{L}+1} \text{ is frozen} \,\}.$$

Here $\mathbb{P}_{\mathcal{L}}$ denotes probability with respect to a matrix $\mathbb{H}$ with $2\mathcal{L}+1$ columns and fixed boundary
conditions. In fact it is interesting to generalize the above definition and consider the correlation
between any point inside the box of size $2\mathcal{L}+1$ and its boundary

$$G(n; \mathcal{L}, R, \gamma) = \mathbb{P}_{\mathcal{L}}\{\, x_{\mathcal{L}+1+n} \text{ is frozen} \,\}.$$

The original definition is recovered for $n = 0$.

FIG. 8: Point-to-set correlation length in units of the interaction range $R$. The continuous line corresponds
to the analytic prediction for $R \to \infty$ and diverges at the glass transition $\gamma_s(k = 3) \approx 0.917935$.

We expect $G(n; L, R, \gamma)$ to be close to 1 when $n$ approaches the boundaries of the box (i.e.
$n \approx L$ or $n \approx -L$) and to decrease in the interior. If the box is large enough, it will approach
its thermodynamic value, independent of the boundary condition, near the center (for $n \approx 0$). In
Figure 7 we show the outcomes of a numerical calculation of $G(n; L, R, \gamma)$ for several values of its
parameters.
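
As a rough illustration of how $G(n; L, R, \gamma)$ could be estimated, the sketch below samples instances and tests whether $x_n$ is frozen by direct Gaussian elimination over GF(2), not by the transfer matrix method of Appendix B. It assumes that each equation $i$ is present independently with probability $\gamma$ and involves $k$ distinct variables drawn uniformly from $\{i - R, \dots, i + R\}$; this reading is inferred from Appendices A and C and should be checked against the definition of the ensemble. All function names are hypothetical.

```python
import random

def reduce_vector(basis, v):
    """Reduce the bitmask v against a GF(2) basis keyed by leading bit."""
    while v:
        lead = v.bit_length() - 1
        if lead not in basis:
            return v
        v ^= basis[lead]
    return 0

def add_row(basis, row):
    """Insert a row into the basis if it is linearly independent of it."""
    row = reduce_vector(basis, row)
    if row:
        basis[row.bit_length() - 1] = row

def is_frozen(n, L, R, k, gamma, rng):
    """One sample: is x_n forced to 0 once all x_j with |j| > L are set to 0?"""
    basis = {}
    for i in range(-L - R, L + R + 1):       # equations whose window meets the box
        if rng.random() >= gamma:
            continue                         # equation i is absent (assumed ensemble)
        row = 0
        for j in rng.sample(range(i - R, i + R + 1), k):
            if abs(j) <= L:                  # boundary variables equal 0 and drop out
                row |= 1 << (j + L)
        add_row(basis, row)
    # x_n vanishes on every solution iff the unit vector e_n is in the row space
    return reduce_vector(basis, 1 << (n + L)) == 0

def estimate_G(n, L, R, k, gamma, samples=200, seed=0):
    """Monte Carlo estimate of the probability that x_n is frozen."""
    rng = random.Random(seed)
    return sum(is_frozen(n, L, R, k, gamma, rng) for _ in range(samples)) / samples
```

Rows are stored as Python integers used as bit sets, so the cost per sample is polynomial in the box size, but the sketch makes no attempt to exploit the band structure of $\mathbb{H}$.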

We are particularly interested in the mean field limit. This is obtained by defining

    G(z; \ell, \gamma) = \lim_{R \to \infty} G(n = Rz; L = R\ell, R, \gamma) ,    (24)

that is by measuring lengths in units of the interaction range and letting $R \to \infty$. In agreement
with the interpretation of the previous section, we expect $G(z; \ell, \gamma) = \varphi(z)$, where $\varphi(z)$ solves
Eq. (21) with boundary condition $\varphi(z) = 1$ for $z < -\ell$ and for $z > \ell$. The comparison with
numerical data in Fig. 7 is satisfactory, although the convergence to the $R \to \infty$ limit becomes
progressively slower as $\gamma_s(3) \approx 0.917935$ is approached.

The point-to-set correlation function can be used to define a correlation length, namely the
smallest box size such that the correlation is below a pre-established constant $\epsilon$. Here we will
choose $\epsilon = 1/2$ (see footnote). In formulae

    \ell_s(\gamma, R) = \min\{ \ell : G(L = R\ell, R, \gamma) < 1/2 \} .    (25)

An analytical prediction in the $R \to \infty$ limit can be obtained by solving Eq. (21) with boundary
conditions $\varphi(z) = 1$ for $z \notin [-\ell, \ell]$. The resulting length can be shown to diverge at $\gamma_s$ as
$\ell_s(\gamma, R = \infty) \sim (\gamma_s - \gamma)^{-1}$, in agreement with the mosaic picture (indeed $\Sigma(\gamma) \sim (\gamma_s - \gamma)$ close to the
transition).
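
The quoted value $\gamma_s(3) \approx 0.917935$ can be reproduced with a few lines of code. The sketch below is not taken from the paper: it assumes the standard mean-field XORSAT expression for the complexity in units of $\log 2$, $\Sigma = \varphi - k\gamma\varphi^{k-1} + (k-1)\gamma\varphi^k$, evaluated at the largest solution of $\varphi = 1 - \exp\{-k\gamma\varphi^{k-1}\}$, and locates its zero by bisection. Function names and the bracketing interval are illustrative choices.

```python
import math

def largest_fixed_point(gamma, k, iters=5000):
    """Iterate phi -> 1 - exp(-k*gamma*phi^(k-1)) downward from phi = 1."""
    phi = 1.0
    for _ in range(iters):
        phi = 1.0 - math.exp(-k * gamma * phi ** (k - 1))
    return phi

def complexity(gamma, k):
    """Sigma(gamma) in units of log 2 at the largest fixed point (assumed formula)."""
    phi = largest_fixed_point(gamma, k)
    return phi - k * gamma * phi ** (k - 1) + (k - 1) * gamma * phi ** k

def gamma_s(k, lo=0.85, hi=1.0, tol=1e-9):
    """Bisect for the zero of Sigma, assuming Sigma > 0 at lo and Sigma < 0 at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if complexity(mid, k) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The default bracket $[0.85, 1]$ is appropriate for $k = 3$ only; other values of $k$ need a bracket lying between the corresponding dynamical and static thresholds.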

In Fig. 8 we compare this prediction with the estimates from numerical simulations at finite $R$.
The two are clearly consistent, although the convergence is rather slow in the critical regime.

VI. DISCUSSION 

We defined a simple ensemble of constraint satisfaction problems (more precisely, an ensemble 
of linear problems over integers modulo 2), with one-dimensional Kac structure. The model is 
exactly soluble for infinite interaction range $R \to \infty$ and exhibits in this limit a glassy phase with
an exponential number of pure states and a SAT-UNSAT transition. 

Mean field theory (as interpreted within the mosaic picture) seems to describe the behavior of 
the system at moderately large R. Indeed we were able to get quantitative predictions by taking into 
account the principal modifications of naive mean-field theory, namely a position-dependent order 
parameter, and 1/R corrections. In particular we checked for the first time the divergence of the 
mosaic length scale in a concrete model, by comparing the result of a controlled approximation
(large R limit) with exact numerical calculations. 

We think the KacXOR model can be a useful playground for many ideas developed in the
physics of glasses. Among several interesting research directions, one might consider: (i) Studying
the frustrated ensemble (corresponding to an inhomogeneous linear system); (ii) Introducing a
non-vanishing temperature and studying the corresponding Boltzmann distribution; (iii) Studying
the behavior of Glauber dynamics, and in particular the relation between relaxation time and
mosaic length scale.

Footnote: Any strictly positive constant below the Edwards-Anderson parameter (in this case given by the
largest solution of $\varphi = 1 - \exp\{-k\gamma\varphi^{k-1}\}$) should provide an equivalent definition.

On a different theme, ensembles of random constraint satisfaction problems have been repeatedly
used to test heuristic algorithms [13]. Such tests have limited scope because in practical
applications instances are often structured. It might therefore be insightful to consider ensembles
with some tunable 'structure parameter', such as the interaction range $R$ in the present model.

APPENDIX A: COUNTING SMALL SUBSTRUCTURES 

Consider the random linear system $\mathbb{H}x = b$ defined in Section II. If two lines $i, j \in F$ of $\mathbb{H}$ are
equal, while the corresponding entries in $b$ (namely $b_i$ and $b_j$) are different, then the system has no
solution. We call such a pair a 'bad pair,' and will write $B_{ij} = 1$ if $(i,j)$ is bad, and $B_{ij} = 0$
otherwise. Therefore

    \mathbb{P}\{ \mathbb{H}x = b \text{ is SAT} \} \le \mathbb{P}\{ \cap_{(i,j)} [B_{ij} = 0] \} ,    (A1)

where the intersection ranges over $i, j$ such that $i < j \le i + 2R + 1 - k$. Let $B = \sum_{(i,j)} B_{ij}$ be the
number of bad pairs. In order to bound the right hand side above, we use Janson's inequality,
which implies

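    \mathbb{P}\{ \mathbb{H}x = b \text{ is SAT} \} \le \exp\{ -\mathbb{E}[B] + \Delta / (2(1 - \epsilon)) \} .    (A2)

Here

    \epsilon = \sup_{(i,j)} \mathbb{E}[B_{ij}] ,    \Delta = \sum_{(ij) \sim (lm)} \mathbb{E}[B_{ij} B_{lm}] ,    (A3)

where the sum over $(ij) \sim (lm)$ runs over all the couples of distinct pairs $(ij)$ and $(lm)$ such that
$B_{ij}$ and $B_{lm}$ are not independent.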

It is easy to realize that both $\mathbb{E}[B]$ and $\Delta$ are of order $\Theta(L)$ since they are sums of $\Theta(L)$ positive
terms. Since we are only interested in the coefficient of the order $L$ term, we shall always consider
pairs $(ij)$ in the bulk. Then we have

    \mathbb{E}[B_{ij}] = \frac{\gamma^2}{2} \binom{2R+1-|i-j|}{k} \binom{2R+1}{k}^{-2} .    (A4)

The factor $\gamma^2$ has to be included for having $i, j \in F$ (the two equations must both be present),
$\binom{2R+1-|i-j|}{k} / \binom{2R+1}{k}^2$ is the probability that the two lines in $\mathbb{H}$ coincide, and $1/2$ is the
probability that $b_i \neq b_j$.

Since the above expression is maximized for $|i - j| = 1$, we have $\epsilon = K_0 \gamma^2$ with $K_0$ as in Eq. (7).
Further, by summing over $i, j$ we obtain $\mathbb{E}[B] = K_1 \gamma^2 L + O(1)$ for $L \to \infty$.
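
The combinatorial factor in Eq. (A4) can be checked by brute force for small windows. The sketch below assumes, as the explanation of (A4) suggests, that each present line of $\mathbb{H}$ has its $k$ non-zero entries on a uniformly random subset of the $2R + 1$ positions in its window, independently of the other lines. The function names are hypothetical.

```python
from itertools import combinations
from math import comb

def coincide_probability(R, k, d):
    """Probability that two lines at distance d coincide, by exhaustive enumeration."""
    window_i = range(-R, R + 1)
    window_j = range(d - R, d + R + 1)
    rows_j = set(combinations(window_j, k))        # all supports allowed for line j
    hits = sum(1 for s in combinations(window_i, k) if s in rows_j)
    return hits / comb(2 * R + 1, k) ** 2

def expected_bad_pair(R, k, d, gamma):
    """Eq. (A4) as read here: gamma^2 / 2 * C(2R+1-d, k) / C(2R+1, k)^2."""
    return 0.5 * gamma ** 2 * comb(2 * R + 1 - d, k) / comb(2 * R + 1, k) ** 2
```

Multiplying `coincide_probability` by $\gamma^2 / 2$ should reproduce `expected_bad_pair` for every distance $d \ge 1$, including distances at which the two windows share fewer than $k$ sites and the probability vanishes.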

As for the term $\Delta$, the only non-vanishing contribution comes from the case in which there are
three distinct indices among $\{i, j, l, m\}$. If we denote by $h_n$ the line indexed by $n$ in $\mathbb{H}$, we get

    \Delta = \frac{3}{4} \sum_{i<j<l} \mathbb{P}\{ i, j, l \in F \text{ and } h_i = h_j = h_l \} .    (A5)

The factor 3 counts the number of different couples of pairs in $\{i, j, l\}$ and $1/4$ is the probability
that the corresponding entries in $b$ are different. By computing the above probability and summing
over $i, j, l$ we get $\Delta = K_2 \gamma^3 L + O(1)$, thus proving Eq. (6).



APPENDIX B: POLYNOMIAL TRANSFER MATRIX ALGORITHM 

Consider the constrained partition function (13) and the corresponding transfer matrix recursion
(15). In this Appendix we shall consider only left-to-right iterations and drop the arrow $\to$ in
subscripts. We shall further set $n = 2R$ and use the vector notation $\vec{x}_i = (x_{i+1}, \dots, x_{i+n})$.

The constrained partition function $Z_i(z)$ is just the number of solutions of an inhomogeneous
linear system, obtained by retaining the lines of $\mathbb{H}$ with index in $\{1, \dots, i - R\}$ (and the corresponding
equations), and adding the $n$ equations $x_{i-n+1} = z_1, \dots, x_i = z_n$. As a consequence, for all
the choices of $z$ such that this linear system has a solution, it has the same number of solutions as
the corresponding homogeneous system. Further, the number of solutions of the homogeneous system
is a power of 2 (because it is the size of a linear space over $\mathbb{Z}_2$). Finally, the vectors $z$ for which a
solution exists form a linear space. Therefore, there exists a binary matrix $A_i$ and an integer $\Phi_i$
such that

    Z_i(z) = \begin{cases} 2^{\Phi_i} & \text{if } A_i z = 0 , \\ 0 & \text{otherwise.} \end{cases}    (B1)

The matrix $A_i$ can always be chosen as an $n \times n$ matrix by eliminating, if necessary, linearly dependent
lines.

We therefore reduced the memory requirements from $O(2^n)$ to $\Theta(n^2)$. We now have to show
that $A_i$ and $\Phi_i$ can be computed recursively in polynomial time as well. Consider the recursion
(15) and let $a_i = (a_{i,1}, \dots, a_{i,n+1})$ be the binary vector defined as follows. If $i - R + 1 \notin F$ (the
new equation added in the recursion is not present), then $a_i = 0$. Otherwise, $a_{i,j} = \mathbb{H}_{i-R+1,\, i-2R+j}$
($a_i$ encodes the newly added line of $\mathbb{H}$, properly shifted). Then define the $(n + 1) \times (n + 1)$ matrix
$B_i$ as follows

    B_i = \begin{pmatrix} A_i & 0 \\ \multicolumn{2}{c}{a_i} \end{pmatrix} .    (B2)

Denote by $b_i$ the first column of $B_i$ (i.e. a column vector) and by $\tilde{B}_i$ the $(n + 1) \times n$ matrix formed
by its last $n$ columns. By using Eq. (B1) the recursion (15) can be written as



    Z_{i+1}(z) = 2^{\Phi_i} \sum_{z_0 \in \{0,1\}} \mathbb{I}( b_i z_0 + \tilde{B}_i z = 0 ) .    (B3)

Let us now consider two cases: 

• If $b_i = 0$, then we get immediately the form (B1) for $Z_{i+1}$, by letting $\Phi_{i+1} = \Phi_i + 1$, and
$A_{i+1}$ the matrix obtained by eliminating linear dependencies among rows of $\tilde{B}_i$.

• If $b_i \neq 0$, then there exists at least one vector $\hat{b}_i$ of dimension $(n + 1)$ such that $\hat{b}_i^T b_i = 1$
mod 2. The only non-vanishing term in the sum (B3) is therefore obtained for $z_0 = -\hat{b}_i^T \tilde{B}_i z$
mod 2. Substituting this value of $z_0$, we obtain that $Z_{i+1}$ can again be written in the
form (B1). The new matrix $A_{i+1}$ is obtained by eliminating linearly dependent rows from
$(1 - b_i \hat{b}_i^T) \tilde{B}_i$, while the number of solutions is updated by $\Phi_{i+1} = \Phi_i$.

In practice we found it more convenient to reduce $B_i$ to upper triangular form by Gaussian elimination
before computing $A_{i+1}$ and $\Phi_{i+1}$ as just described.
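
The two cases above can be sketched in code as a single update step. This is an illustrative implementation under stated assumptions, not the authors' program: rows are stored as Python integers used as bit sets (bit $j$ stands for $z_j$), the variable $z_0$ is eliminated with a pivot row instead of the explicit projector $(1 - b_i \hat{b}_i^T)$, and the names `transfer_step` and `independent_rows` are hypothetical.

```python
def independent_rows(rows):
    """Return a basis of the GF(2) row space spanned by the given bitmask rows."""
    basis = {}
    for r in rows:
        while r:
            lead = r.bit_length() - 1
            if lead not in basis:
                basis[lead] = r
                break
            r ^= basis[lead]
    return list(basis.values())

def transfer_step(A, phi, a):
    """One step Z_i -> Z_{i+1} in the representation (B1).

    A   : rows of A_i as bitmasks over z_0 .. z_{n-1}
    phi : Phi_i, so that Z_i equals 2**phi on the kernel of A_i
    a   : newly added equation as a bitmask over z_0 .. z_n (0 if absent)
    """
    rows = [r for r in A + [a] if r]                 # non-zero rows of B_i
    idx = next((t for t, r in enumerate(rows) if r & 1), None)
    if idx is None:                                  # first column b_i = 0
        new_phi = phi + 1                            # both values of z_0 are allowed
        reduced = rows
    else:                                            # b_i != 0: solve a pivot row for z_0
        pivot = rows[idx]
        new_phi = phi
        reduced = [r ^ pivot if r & 1 else r
                   for t, r in enumerate(rows) if t != idx]
    # drop the z_0 column and relabel z_1 .. z_n as the new window
    return independent_rows([r >> 1 for r in reduced]), new_phi
```

Each step costs one pass of elimination over at most $n + 1$ rows of $n + 1$ bits, which is the source of the polynomial dependence on the interaction range.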

The initialization of the above recursion depends on the choice of boundary conditions. When
using fixed boundary conditions with reference solution $x^* = 0$, we set $A_0 = \mathbb{1}$ and $\Phi_0 = 0$.

It is clear that the above procedure can be implemented in a time that is polynomial in the
interaction range. Indeed the most complex operation to be performed consists in eliminating
linearly dependent lines from the matrix $\tilde{B}_i$, or $(1 - b_i \hat{b}_i^T) \tilde{B}_i$. This can be done via Gaussian
elimination in time $O(R^3)$. The total complexity is therefore $O(L R^3)$.

APPENDIX C: ANALYTICAL CALCULATIONS
1. Replicas 

In order to compute the ground state entropy and the point-to-set correlation function, we
shall follow the replica approach, see [25]. Each site $i \in \{1, \dots, L\}$ thus carries $n$ binary variables
$x_i = (x_i^1, \dots, x_i^n)$ corresponding to the $n$ replicas.

Let us consider first a particularly simple instance consisting of a single equation labeled by
$i \in F$ and $2R + 1$ variables on sites $j \in \{i - R, \dots, i + R\}$. Denote by $c_i(x)$ the fraction of nodes $j$
such that $x_j = x$. In formulae

    c_i(x) = \frac{1}{2R+1} \sum_{j=i-R}^{i+R} \mathbb{I}(x_j = x) .    (C1)

The probability that a randomly sampled equation at $i$ (with range $\{i - R, \dots, i + R\}$) is satisfied
by all of the $n$ replicas is a function of $c_i$, call it $\mathbb{F}_{k,R}(c_i)$. For large $R$ it is easy to show that

    \mathbb{F}_{k,R}(c) = J_k(c) + \frac{1}{2R+1} \binom{k}{2} [ J_k(c) - J_{k-2}(c) ] + O(R^{-2}) ,    (C2)

where

    J_l(c) = \sum_{x_1 \dots x_l} \prod_{a=1}^{n} \mathbb{I}( x_1^a \oplus \cdots \oplus x_l^a = 0 ) \, c(x_1) \cdots c(x_l) .    (C3)
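
The expansion (C2) can be checked numerically in the simplest setting of a single replica ($n = 1$). The sketch below is an illustration under that assumption: the window contains $N = 2R + 1$ sites, $m$ of which carry $x = 1$, so that $J_l = [1 + (1 - 2p)^l]/2$ with $p = m/N$, while the exact probability for $k$ distinct sites follows from a hypergeometric count. The correction is taken with prefactor $\binom{k}{2}/(2R+1)$; replacing $2R + 1$ by $2R$ would change the result only at order $R^{-2}$. Function names are hypothetical.

```python
from math import comb

def J(l, p):
    """J_l for one replica: l independent draws with P(x = 1) = p XOR to 0."""
    return 0.5 * (1.0 + (1.0 - 2.0 * p) ** l)

def F_exact(k, N, m):
    """Probability that k distinct sites out of N (m of them equal to 1) XOR to 0."""
    even = sum(comb(m, j) * comb(N - m, k - j) for j in range(0, k + 1, 2))
    return even / comb(N, k)

def F_expanded(k, N, m):
    """First-order expansion (C2) with N = 2R + 1 sites in the window."""
    p = m / N
    return J(k, p) + comb(k, 2) / N * (J(k, p) - J(k - 2, p))
```

The difference between `F_exact` and `F_expanded` should shrink like $N^{-2}$ as the window grows, which is what the $O(R^{-2})$ remainder in (C2) expresses.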

Consider now the full linear system and the partition function (11). We shall implicitly assume
periodic boundary conditions in order to lighten the notation. Fixed boundary conditions can be
recovered by properly constraining the expressions that we will derive. It follows from the above
that

    \mathbb{E}\{Z^n\} = \sum_{\{x_i\}} \prod_{i=1}^{L} [ 1 - \gamma + \gamma \mathbb{F}_{k,R}(c_i) ] .    (C4)

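Next we introduce two variables $\lambda_i(x)$, $c_i(x)$ indexed by $x \in \{0, 1\}^n$ for each $i \in \{1, \dots, L\}$, using
the identity

    1 = \int dc_i(x) \int_{-i\infty}^{+i\infty} \frac{d\lambda_i(x)}{2\pi i} \exp\{ -\lambda_i(x) ( c_i(x) - \hat{c}_i(x) ) \} ,    (C5)

where $c_i(x)$ is the integration variable and $\hat{c}_i(x)$ stands for the empirical fraction defined in Eq. (C1).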

This allows one to perform the sum over $x_i$ in Eq. (C4). If we expand the resulting expression for large
$R$ we get, after some lengthy but straightforward calculations,

    \mathbb{E}\{Z^n\} = \int \prod_{i,x} dc_i(x) \frac{d\lambda_i(x)}{2\pi i} \exp\{ -(2R + 1) S_0[c, \lambda] - S_1[c] + O(1/R) \} ,    (C6)



where $S_0[c, \lambda]$ and $S_1[c]$ are defined by Eqs. (C7) and (C8) (the displayed expressions are not legible
in this copy), and we introduced the notation $D(i) = \{ j : |i - j| \le R \}$.



2. Mean field limit 



In the $R \to \infty$ limit, the integral (C6) is dominated by the saddle points of $S_0[c, \lambda]$. We neglect
for the moment the correction given by $S_1[c]$, and look for a saddle point of the type

    c_i(x) = \varphi_i \, \delta_{x,x_0} + \frac{1}{2^n} (1 - \varphi_i) ,    \lambda_i(x) = \omega_i \, \delta_{x,x_0} + \frac{1}{2^n} \omega_i^0 ,    (C9)

where $x_0 = (0, 0, \dots, 0)$ and $\delta_{x,y}$ is the $n$-dimensional Kronecker delta function. There are several
reasons for this Ansatz: (i) The algebra of functions of the form $f(x) = f_0 + f_1 \delta_{x,x_0}$ is closed; (ii)
This Ansatz is known to give the correct thermodynamic behavior for the rXOR ensemble (i.e. in
the mean-field limit); (iii) Although it is replica symmetric, it yields the correct one-step replica
symmetry breaking physics (it is a peculiarity of XORSAT that replica symmetry can be explicitly
broken).

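By substituting in Eq. (C7) and letting $n \to 0$, we get $S_0[c, \lambda] = A_0[\varphi, \omega] \, n \log 2 + O(n^2)$ where

    A_0[\varphi, \omega] = \sum_{i=1}^{L} \Big\{ \omega_i (1 - \varphi_i) - \gamma (1 - \varphi_i^k) + e^{-\frac{1}{2R+1} \sum_{j \in D(i)} \omega_j} \Big\} .    (C10)

By differentiating with respect to $\varphi_i$ and $\omega_i$ we get the saddle point equations

    \varphi_i = 1 - \frac{1}{2R+1} \sum_{j \in D(i)} e^{-\frac{1}{2R+1} \sum_{l \in D(j)} \omega_l} ,    \omega_i = k \gamma \varphi_i^{k-1} .    (C11)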

The second of these equations can be used to eliminate $\omega_i$ from the action.

If we finally assume that $\varphi_i$ only depends on $i$ on a scale of order $R$, we can set (with an abuse
of notation) $\varphi_i = \varphi(i/R)$ and let $R \to \infty$ with $L = \ell R$, thus getting Eqs. (19) to (21).
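
At finite $R$ the saddle point equations can also be iterated directly on the lattice. The sketch below assumes that (C11) reads $\varphi_i = 1 - (2R+1)^{-1} \sum_{j \in D(i)} \exp\{-(2R+1)^{-1} \sum_{l \in D(j)} \omega_l\}$ with $\omega_l = k\gamma\varphi_l^{k-1}$, and imposes the fixed boundary condition $\varphi_i = 1$ for $|i| > L$ used in Section V. Starting from $\varphi \equiv 1$ the iteration decreases monotonically towards the largest solution. The function name and the number of iterations are illustrative.

```python
import math

def profile(gamma, k, R, L, iters=100):
    """Iterate the (assumed) saddle point equations with phi = 1 outside [-L, L]."""
    phi = [1.0] * (2 * L + 1)                      # phi[i + L] holds phi_i, |i| <= L

    def get(i):
        return phi[i + L] if abs(i) <= L else 1.0  # frozen boundary

    w = 2 * R + 1
    for _ in range(iters):
        # e[j] = exp(-(1/(2R+1)) * sum_{l in D(j)} k*gamma*phi_l^(k-1))
        e = {j: math.exp(-k * gamma / w *
                         sum(get(l) ** (k - 1) for l in range(j - R, j + R + 1)))
             for j in range(-L - R, L + R + 1)}
        phi = [1.0 - sum(e[j] for j in range(i - R, i + R + 1)) / w
               for i in range(-L, L + 1)]
    return phi
```

For $k = 3$ and a box of half-width $L = 15R$, the value at the center is expected to be essentially zero well below the dynamical transition (for instance at $\gamma = 0.5$) and to stay close to the homogeneous solution above $\gamma_s$ (for instance at $\gamma = 0.95$).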



3. 1/R corrections 

In computing the $1/R$ corrections we shall assume the system to be homogeneous. For instance
we can think of imposing periodic boundary conditions, or letting $L \to \infty$ at the outset. As a
consequence, in the leading order calculation we have $\varphi_i = \varphi$ independent of $i$ and thus $A_0[\varphi, \omega] =
L \phi^{(0)}(\gamma; \varphi)$ with $\phi^{(0)}(\gamma; \varphi)$ as in Eq. (18). Hereafter $\varphi$ will denote a solution of the mean field
equation $\varphi = 1 - \exp\{-k\gamma\varphi^{k-1}\}$, and we let $\omega = k\gamma\varphi^{k-1}$.

There are two contributions to order $1/R$. The first one comes from the correction to the action
and is easy to compute. Substituting our Ansatz in Eq. (C8) and proceeding as in the previous
Section we get $S_1[c] = A_1[\varphi] \, n \log 2 + O(n^2)$, where $A_1[\varphi]$ is defined in Eq. (C12).

The second term comes from Gaussian fluctuations around the saddle point. Let $c_i^*(x)$, $\lambda_i^*(x)$
denote the saddle point (C9) and define

    c_i(x) = c_i^*(x) + \xi_i(x) ,    \lambda_i(x) = \lambda_i^*(x) + \zeta_i(x) .    (C13)

By expanding $S_0[c, \lambda]$ to second order around its saddle point, we get

    S_0[c, \lambda] = S_0[c^*, \lambda^*] + \frac{1}{2} \sum_{i,j} \sum_{x,y} \Big\{ A(x,y) \delta_{ij} \xi_i(x) \xi_j(y) + \delta_{x,y} \delta_{ij} [ \xi_i(x) \zeta_i(y) + \zeta_i(x) \xi_i(y) ]
                      - B(x,y) K_R(i - j) \zeta_i(x) \zeta_j(y) \Big\} ,    (C14)

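where

    K_R(i) = \begin{cases} (2R + 1 - |i|) / (2R + 1)^2 & \text{for } |i| < 2R + 1 , \\ 0 & \text{otherwise.} \end{cases}    (C15)

The coefficients appearing in Eq. (C14) have the form

    A(x, y) = A_1 + A_2 \delta_{x,y} + A_3 [ \delta_{x,x_0} + \delta_{y,x_0} ] + A_4 \delta_{x,x_0} \delta_{y,x_0} ,    (C16)
    B(x, y) = B_1 + B_2 \delta_{x,y} + B_3 [ \delta_{x,x_0} + \delta_{y,x_0} ] + B_4 \delta_{x,x_0} \delta_{y,x_0} ,    (C17)
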

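where (defining $z = 1 - \gamma (1 - 2^{-n})(1 - \varphi^k)$)

    A_1 = -\frac{2^{-n}}{z} k(k-1) \gamma (1 - \varphi^{k-2}) + \frac{2^{-2n}}{z^2} k^2 \gamma^2 (1 - \varphi^{k-1})^2 ,    (C18)
    A_2 = -\frac{1}{z} k(k-1) \gamma \varphi^{k-2} ,    (C19)
    A_3 = \frac{2^{-n}}{z^2} k^2 \gamma^2 (1 - \varphi^{k-1}) \varphi^{k-1} ,    (C20)
    A_4 = \frac{1}{z^2} k^2 \gamma^2 \varphi^{2(k-1)} ,    (C21)

and (defining $\Lambda_0 = (e^{\omega} - 1 + 2^n)^{-1}$ and $\Lambda = 1 - 2^n \Lambda_0$)

    B_1 = -\Lambda_0^2 ,    B_2 = \Lambda_0 ,    (C22)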

    B_3 = -\Lambda \Lambda_0 ,    B_4 = \Lambda (1 - \Lambda) .    (C23)

The quadratic form in Eq. (C14) can be diagonalized both in position space (by Fourier transform)
and in replica space (all the eigenvectors have the form $\xi(x) = C_0 \delta_{x,x_0} + C_1$). One can therefore
perform the Gaussian integral, and let $n \to 0$. Putting this contribution together with the action
correction, cf. Eq. (C12), we finally get the entropy correction

    [Eq. (C24): the displayed expression is not legible in this copy]    (C24)

where

    a = k(k-1) \gamma \varphi^{k-2} e^{-\omega} ,    (C25)

    b = -k(k-1) \gamma \varphi^{k-2} e^{-\omega} (1 - e^{-\omega}) + k^2 \gamma^2 \varphi^{2(k-1)} e^{-\omega} .    (C26)
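
The structure of $B(x, y)$ in (C17) can be verified numerically. The sketch below assumes that $B$ is the Hessian of $\log \sum_x e^{\mu(x)}$ with respect to $\mu$, evaluated at $\mu(x) = \omega \, \delta_{x,x_0}$ (a constant shift of $\mu$ does not matter), and that the coefficients are $B_1 = -\Lambda_0^2$, $B_2 = \Lambda_0$, $B_3 = -\Lambda \Lambda_0$, $B_4 = \Lambda (1 - \Lambda)$ with $\Lambda_0 = (e^{\omega} - 1 + 2^n)^{-1}$ and $\Lambda = 1 - 2^n \Lambda_0$. Replica configurations are encoded as integers, with $x_0$ encoded as 0; the function names are hypothetical.

```python
import math

def hessian_direct(n, omega):
    """Hessian of log sum_x exp(mu(x)) at mu(x) = omega * delta_{x, x0}."""
    size = 2 ** n
    weights = [math.exp(omega) if x == 0 else 1.0 for x in range(size)]
    total = sum(weights)
    p = [wt / total for wt in weights]
    # second derivative: p(x) delta_{x,y} - p(x) p(y)
    return [[(p[x] if x == y else 0.0) - p[x] * p[y] for y in range(size)]
            for x in range(size)]

def hessian_formula(n, omega):
    """B(x, y) from the form (C17) with the coefficients assumed above."""
    lam0 = 1.0 / (math.exp(omega) - 1.0 + 2 ** n)
    lam = 1.0 - 2 ** n * lam0
    B1, B2, B3, B4 = -lam0 ** 2, lam0, -lam * lam0, lam * (1.0 - lam)
    size = 2 ** n

    def d0(x):
        return 1.0 if x == 0 else 0.0            # delta_{x, x0}

    return [[B1 + B2 * (x == y) + B3 * (d0(x) + d0(y)) + B4 * d0(x) * d0(y)
             for y in range(size)] for x in range(size)]
```

The two functions should agree entry by entry for any number of replicas $n$ and any $\omega$, which is a quick consistency check on the four-parameter form of the fluctuation matrix.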